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Abstract Transform Invariant Low-rank Textures (TILT) 
is a novel and powerful tool that can effectively rec- 
tify a rich class of low-rank textures in 3D scenes from 
2D images despite significant deformation and corrup- 
tion. The existing algorithm for solving TILT is based 
on the alternating direction method (ADM). It suffers 
from high computational cost and is not theoretically 
guaranteed to converge to a correct solution to the in- 
ner loop. In this paper, we propose a novel algorithm 
to speed up solving TILT, with guaranteed convergence 
for the inner loop. Our method is based on the recently 
proposed linearized alternating direction method with 
adaptive penalty (LADMAP). To further reduce com- 
putation, warm starts are also introduced to initialize 
the variables better and cut the cost on singular value 
decomposition. Extensive experimental results on both 
synthetic and real data demonstrate that this new al- 
gorithm works much more efficiently and robustly than 
the existing algorithm. It could be at least five times 
faster than the previous method. 
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1 Introduction 

Extracting invariants in images is a fundamental prob- 
lem in computer vision. It is key to many vision tasks, 
such as 3D reconstruction, object recognition, and scene 
understanding. Most of the existing methods start from 



low level local features, such as SIFT points (Schindler 



et al 2008), corners, edges, and small windows, which 



are inaccurate and unrobust. Recently, Zhang et al. (Zhang 



et al 2012b) proposed a holistic method called the 



Transform Invariant Low-rank Textures (TILT) that 
can recover the deformation of a relatively large im- 
age patch so that the underlying textures become reg- 
ular (see Figures [I] (a) and (b)). This method utilizes 
the global structure in the image patch, e.g. the regu- 
larity, such as symmetry, rectilinearity and repetition, 
that can be measured by low rankness, rather than the 
low level local features, hence can be very robust to 
significant deformation and corruption. 

TILT has been applied to many vision tasks and 
been extended for many computer vision applications. 



For example, Zhang et al. (Zhang et al 2011b) used 



TILT to correct lens distortion and do camera calibra- 
tion, without detecting feature points and straight lines. 



Zhang et al. (Zhang et al 2012a) applied TILT to rec- 



tify texts in natural scenes to improve text recognition 



on mobile phones. Zhao et al. (Zhao and Quan 2011)) 
proposed a method for detecting translational symme- 



try using TILT. Zhang et al. ( | Zhang et al 2011a) fur- 
ther generalized the transforms allowed by TILT to 
polynomially parameterized ones so that the shape and 
pose of low rank textures on generalized cylindrical sur- 
faces can be recovered. Mobahi et al. (Mobahi et al 



20111 used the low rank textures recovered by TILT 



as the building block for modeling urban scenes and 
reconstructing the 3D structure. 
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Fig. 1 TILT for rectifying textures. Left column: The orig- 
inal image patches, specified by red rectangles, and the trans- 
forms found by TILT, indicated by green quadrilaterals. Right 
column: rectified textures using the transforms found. Top 
row: results by our method for solving TILT. Bottom row: 
results by the original method in (Zh ang et al| |2012b[ ) for 
solving TILT. The original method does not converge to a 
correct solution. (Images in this paper are best viewed on 
screen!) 



TILT was inspired by Robust Alignment by Sparse 



and Low-rank decomposition (RASL) ( Peng et al . 2010 ), 
which has been successfully applied to align faces and 
video frames. TILT and RASL share the same mathe- 
matical model, namely after an appropriate transform 
a data matrix can be decomposed into a low rank com- 
ponent and a sparse one (see 0). The only difference 
between TILT and RASL is that the data matrix in 
TILT is an image patch, while that in RASL is multi- 
ple images or frames, each being a column of the matrix. 
So in the sequel, we just focus on TILT. 

The existing most efficient solver for TILT is based 



on the alternating direction method (ADM) (Zhang 



et al 2012b Lin et al 2009 1 . It has been adopted by all 



the researchers that use TILT for their problems. How- 
ever, it still requires multiple seconds to rectify a small 
sized image patch, making the applications of TILT 
far from being interactive. Another critical drawback 
of the existing solver is that it was derived out of intu- 
ition by simply mimicing the ADM for the robust prin- 
cipal component analysis (RPCA) problem presented 



in ( |Lin et al| 2009). In the literature of ADM in the 
Gauss-Seidel fashion, almost all the convergence results 
were proven under the case of two variables, while the 
inner loop of TILT problem (see pi) has three vari- 
ables. Hence the convergence of ADM for the inner 
loop of TILT is not theoretically guaranteed. In our 
experiments, we did find many examples that the ex- 



isting solver for TILT failed to produce a correct so- 
lution (e.g., see Figures [TJc), [TJd), [2j [3] and El). Conse- 
quently, to make the algorithm workable, its parameters 
have to be carefully tuned in order to trade off between 
convergence speed and robustness at the best (c.f. last 



paragraph of Section 2.2), which is difficult for differ- 
ent applications. The above drawbacks of the existing 
algorithm motivated us to design a more efficient al- 
gorithm for TILT, with a theoretical guarantee on the 
convergence of its inner loop. 

We observe that the original method does not al- 
ways converge to a correct solution because the inner 
loop of the original TILT problem have three variables. 
This motivates us to cancel one of the variables and 
solve an equivalent TILT problem that has only two 
variables in its inner loop. Then using the recently de- 
veloped linearized alternating direction method with 



adaptive penalty (LADMAP) (Lin et al 2011 1 and some 
good properties of the annihilation matrix, the inner 
loop can be solved efficiently and with a convergence 
guarantee. We further speed up the computation by 
warm starts in two ways. First, we initialize the vari- 
ables in the inner loop by their values when they exit 
the previous inner loop. This gives the variables very 
good initial values. So the number of iterations in the 
inner loop can be greatly cut. Second, as singular value 
decomposition (SVD) is the major bottleneck of com- 
putation, we also solve SVD by warm start, where the 
singular vectors and singular values are initialized as 
those in the previous iteration. The update of SVD is 
also based on a recently developed technique of opti- 



mization with orthogonality constraints (Wen and Yin 



2012). As a result, our new algorithm, called LADMAP 
with warm starts (LADMAP+WS), can be at least five 
times faster than the original method and has a con- 
vergence guarantee for the inner loop. 

The rest of this paper is organized as follows. In 
Section [2] we review the TILT problem and the ADM 
method for solving it. In Section [3] we introduce the 
LADMAP method for solving the inner loop of TILT 
and the variable warm start technique. In Section [4j we 
present some important details of applying LADMAP 
to TILT so that the computations can be made effi- 
cient. In Section [5j we show the warm start technique 
to compute SVD. Experimental results on both simu- 
lated and real data are reported in Section [6j Finally, 
we conclude our paper in Section [7J 



2 Transform Invariant Low-rank Textures 

In this section, we first briefly review the mathematical 
model of TILT. Then we introduce its corresponding 
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convex surrogate and the existing ADM based method 
for solving it. 



the inner loop for solving TILT. When the increment 
At is computed, the transform is updated as 



2.1 Mathematical Model 

Consider a 2D texture as a matrix A g ]R mx ™. It is 
called a low-rank texture if r <C min(m, n) , where r = 
rank(j4). However, a real texture image is hardly an 
ideal low-rank texture, mainly due to two factors: 1. it 
undergoes a deformation, e.g., a perspective transform 
from 3D scene to 2D image; 2. it may be subject to 
many types of corruption, such as noise and occlusion. 

So if we could rectify a deformed texture D with a 
proper inverse transform t and then remove the cor- 
ruptions E, then the resulting texture A will be low 
rank. This inspires the following mathematical model 



for TILT (Zhang et al 2012b) 



min rank(A) + A||-E||o, 
a,e,t v ' 1 " 



s.t. Dot = A + E, 



(1) 



l 2 belongs to a certain group of trans- 
affine transforms, perspective transforms, 



where r : M 2 
forms, e.g. 

and general cylindrical transforms (Zhang et al 2011a), 
||i?||o denotes the number of non-zero entries in E, and 
A > is a weighting parameter which trades off the 
rank of the underlying texture and the sparsity of the 
corruption. 



2.2 The Existing Method for Solving TILT 

The above problem ([TJ is not directly tractable, be- 
cause the rank of a matrix and the ^o _n orm are discrete 
and nonconvex, thus solving the optimization problem 
is NP-hard. As a common practice, rank and £o~ nornl 



An 



In the following, for brevity when we are focused on the 
inner loop, for simplicity we may omit the superscripts 
i + 1 or i of variables. This should not cause ambiguity 
by referring to the context. 



Zhang et al. ( Zhang et al 2012b I proposed an ADM 
based method to solve ([3]). It is to minimize the aug- 
mented Lagrangian function of problem (J3j) : 

L(A,E,At,Y,») 
= \\A\\, + XWEWi + (Y, D o T + JAt-A-E) 



-%\\Dot- 



JAt-A-E\\ 2 f , 



with respect to the unknown variables alternately (hence 
the name ADM), where Y is the Lagrange multiplier, 
(A, B) = tr(A T B) is the inner product, || • \\f is the 
Frobenius norm, and p > is the penalty parameter. 



The ADM in (Zhang et al 2012b) goes as follows: 



A k+ i 


= argmin£(A, E k ,Ar k , Y k ,fj, k ), 

A 


(4) 


Ek+i 


= argmin L(A k+ll E, AT kl Y kl fi k ), 

E 


(5) 


&T k+1 


= argmin L(A k+1 , E k+1 , AT,Y k , p, k ), 

At 


(6) 


Y k+1 


= Y k + p, k (Do T + JAr k+1 - A k+1 - £ fe _ 


-i), 









where p > 1 is a constant. All subproblems (|4|)-(|6|) have 
closed form solutions (|Zhang et al 2012b) (cf. (24) and 



(26)). More complete details of the algorithm can be 



found as Algorithm 1 in (Zhang et al 2012b I. 



2004 1 (the sum of the absolute values of entries) , respec- 



tively. This yields the following optimization problem 



with a convex objective function (Zhang et al 2012b): 
AlLElli, s.t. Dot 



min \\A\ 

A,E,t 



A + E. 



(2) 



Although the above ADM in the Gauss-Seidel fash- 
could be replaced by the nuclear norm flCandes et al| i on empirically works well in most cases, there is no the- 
20TT| (the sum of the singular values) and f i-norm ponoh p retical guarantee on its convergence. There are two fac- 
tors that may result in its non-convergence. Namely, all 
the previous convergence results of ADM in the Gauss- 
Seidel fashion were proven under the conditions that 
the number of unknown variables are only twd^ and 
the penalty parameter is upper bounded. In contrast, the 
inner loop of TILT has three variables and its penalty 
parameter is not upper bounded. As one will see, the 
naive ADM algorithm above may not produce a correct 
solution. In particular, the choice of p is critical to in- 
fluence the number of iterations in the inner loop and 
the quality of solution. If p is small, there will be a lot 

1 For the case of more than two variables, ADM with some 
appropriate modifications, such as introducing a dummy 
variable and treating all the original variables as a super 
block (Bertseka s and Tsitsiklisl |1997[) or introducing a Gaus- 
sian back substitution ( |He et al||2012| ), can be proven to con- 
verge. 



The above problem is not a convex program as the 



constraint is nonconvex. So Zhang et al. (Zhang et al 



2012b ) proposed to linearize for at the previous t 1 
as D o [t % + At) w D o t % + J At and determine the 
increment At in the transform by solving 



min IIAII 

A,E,At 



■A||£||i, s.t. Dot 1 + J At = A + E, (3) 



where J is the the Jacobian: derivative of the image 
with respect to the transform parameters. J is full col- 
umn rank. We call the iterative procedure to solve ([3| 
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of iterations but the solution is very likely to be cor- 
rect. If p is large, the number of iterations is small but 
an incorrect solution may be produced. So one has to 
tune p very carefully so that the number of iterations 
and the quality of solution can be best traded off. This 
is difficult if the textures are different. So in this paper 
we aim at addressing both the computation speed and 
the convergence issue of the inner loop in the original 
method. 



3 Solving the Inner Loop by LADMAP 

In this section, we show how to apply a recently de- 
veloped method LADMAP to solve the inner loop of 
TILT. We first reformulate ^ so that At is canceled 
and hence LADMAP can be applied. 



resulting in the following updating scheme: 

x fc+ i = argmm/(x) H — x - x fe 

x 2 

+A*{ lk + /i fc M(xfc) + B(y fc ) - c))/(^^)|| 2 , (11) 

yfe+i = argmins'(y) H — y - y k 

y 2 

+B*{ lk + p k {A(x k+1 ) + B{y k ) - c))/(^ kVB )\\ 2 7 (12) 



and A is still updated as ( 10 1, where i]a > and r\B > 
are some parameters. Then one can see the new sub- 



problems (111 and (12) can have closed- form solutions 



again when / and g are norms. Lin et al. also proposed 
a strategy to adaptively update the penalty parameter 
/i [^] and proved that when p is non-decreasing and up- 
per bounded, and r\ A > \\A\\ 2 and r\ B > \\B\\ 2 , (xfc,yfc) 
converges to an optimal solution to ([7]), where ||„4|| and 
||B|| are the operator norms of A and B, respectively. 
For more details, please refer to ( |Lin et al 2011 ). 



3.1 Sketch of LADMAP 



3.2 Reformulating the Inner Loop of TILT 



LADMAP (Lin et al 2011 ) aims at solving the following 



type of convex programs: 



min/(x) + <?(y), s.t. A(x) + B(y) = c, 



(7) 



where x, y and c could be either vectors or matrices, 
/ and g are convex functions, and A and B are linear 
mappings. 

If the original ADM is applied to Q , x and y are up- 
dated by minimizing the augmented Lagrangian func- 
tion of Q, resulting in the following updating scheme: 

x fc+ i = argmin/(x) 

X 

+ ^\\A(x) + B(y k )-c + lk /p k \\ 2 , (8) 

yfc+i = argming(y) 
y 

+ ^||£(y)+^(x fe+1 )-c + 7fc / Alfc || 2 , (9) 
7fe+i = Ik + p k [A(x k+1 ) + B(y k+ i) - c], (10) 

where 7 is the Lagrange multiplier and p is the penalty 
parameter. In many problems, / and g are vector or 
matrix norms, and the subproblems Q and ([£]) have 
closed-form solutions when A and B are identity map- 



pings ( 


Cai et al| 


2010 


Donoho 


1995 


Liu et al 


2010 


Yang and Zhang 


2011 


1 , hence easily solvable. However, 



when A and B are not identity mappings, subproblems 
{[sj) and ([9]) may not be easy to solve. So Lin et al. (Lin 



et al 2011[ ) proposed linearizing the quadratic penalty 



term in the augmented Lagrangian function and adding 
a proximal term in ^ and ^ for updating x and y, 



As almost all existing convergence results of ADM or 
linearized ADM in the Gauss-Seidel fashion are proven 
in the case of two variables, while the inner loop of 
TILT has three variables, we aim at canceling one vari- 
able by taking advantage of the special structure of the 
problem. 

We notice that At does not appear in the objective 
function of ([3| and it is easy to see that if (A* .E*) is 
an optimal solution, then the optimal At must be 



At* = {J 1 jy 1 J 1 {A* +E* -Do t), 



(13) 



as (A* , E* , At* ) must satisfy the linear constraint in 
([3]). So we have to find an optimization problem to ob- 
tain (A*,E*). 

By plugging (13) into the linear constraint in (J3|, 
one can see that (A* , E* ) must satisfy the following 
constraint: 



J i 4 + J i £ = J 1 (Dor) 



(14) 



where J = I — J(J T J) 1 J T and I is the identity ma- 
trix. So we have an optimization problem for (A* , E*): 



min|U|L+A||B||i, s.t. J- L A+J- L E = J ± (Do T ). (15) 

A,E 

We can prove the following proposition. 



Please refer to (20 I and (23 I. For succinctness, we do not 



repeat it here. Disregarding the adaptive penalty, LADMAP 
is a special case of the generalized ADM ( |Deng and Yin||2012[ ) 
and is closely related to the predictor corrector proximal mul- 
tiplier method ( |Chen and Teboulle||1994| ) which updates vari- 
ables parallelly rather than alternately. 
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Proposition 1 Problems and (15) have the same 
optimal solution (A*,E*). 

Proof. We only have to prove that the constraints for 
(A, E) do not change. The constraints for (A, E) in ([3| 
and ( 15 ) can be written as 



51 = {(A, E)\3At such that D o r + J At = A + E], 
and 

5 2 = {(A, E)\J X A + J X E = J X (D o t)}, 

respectively. If (A, E) £ Si, then we can multiply J to 
both sides of D o t + JZ\t = A + £7 to see that (A, E) £ 
S 2 . If (A,E) e 5 2s then Dot-4-£ € nul^J- 1 ). 
Since null(J ) = span(J), there exists At such that 
Dot- A-E = J At. Thus (A, E) £ Si. □ 

One can easily check that J -1 has the following nice 
properties: 



(J ± f = J 1 - and J ± J X = J X . 

Moreover, denote the operator norm of J- 1 
Then we have 

Proposition 2 | J^j = 1. 



(16) 

1.7-4. 



Proof: From (16) we have (J' 1 ) 2 = J X - So the eigenval- 
ues A of J satisfies A 2 — A = 0. Thus, the eigenvalues 
of J x are either 1 or 0. As J 1 - is symmetric, || J -1 ] = 1. 
□ 



Applying LADMAP ((11 1,(12), and (10)) directly 



to (15) (where x, y, and 7 are A, E, and Y, respec 
tively), with some algebra we have the following updat 
ing scheme: 



A k+ i = argmin \\A\\, 

A 2 



\A-M k \\ 2 F , 



E k+ i = argmin X\\E\\i 

E 



E-N k \\ 2 F , 



E k 



fc+1 



Dot), 



Y k+ i = Y k + ii k J^(A k+ i 
Hk+i = min(/x moa: , p ■ jUfe), 
where 

M fe = A k - J x {A k + E k - D or + Y k /fi k ), 
N k =E k - J x {A k+ i +E k -DoT + Y k /fi k ), 

Umax is an upper bound of /i k and 

po, if p k ■ max(\\A k+1 - A k \\ F , 
/'= \ \\E k+ i-E k \\ F )/\\J x {DoT)\\ F <e 2 , 

1, otherwise, 



(17) 
(18) 

(19) 
(20) 

(21) 
(22) 



(23) 



in which p > 1 is a constant and e 2 > is a small 



threshold. Note that in (17)-(18) we have utilized the 
properties of J x in (16) and \\J X \\ = 1. The updating 



scheme ( 20 ) and ( 23 1 for fi comes from the adaptive 



penalty strategy in LADMAP (Lin et al 2011) 



The closed form solution to (17) is as follows (Cai 
eTail[20l0| : 

A k+ i=S^(M k ), (24) 

f k 

where S(-) is the singular value shrinkage operator: 

S S (W) = UT S {E)V T , (25) 

in which U£V T is the SVD of W and T e (-) is the scalar 
shrinkage operator defined as ( Donoho] 1995): 

T e (x) — sgn(x) max(|x| — e, 0). 



Subproblcm ( 18 ) also has a closed form solution as 



follows (Yang and Zhang 2011): 
E k+ i=T^(N k ). 



(26) 



The iterations (17)-(20) stop when the following two 



conditions are satisfied: 

|| J x (A k+ i + E k+ i - D o t)\\ f /\\J x {D o T )\\ F <ei (27) 
and 

p, k -max(\\A k+ i-A k \\ F , \\E k+ x-E k \\ F )/\\J x (Do T )\\p < e 2 - 

(28) 

These two stopping criteria are set based on the 



KKT conditions of problem (15). Details of the deduc- 



tion can be found in (Lin et al 



2011). 



3.3 Warm Starting Variables in the Inner Loop 

Since the number of iterations in the inner loop greatly 
affects the efficiency of the whole algorithm, we should 
provide good initial values to the variables in the inner 
loop so that the number of iterations can be reduced. 

The original algorithm initializes A as the input Dot 
and E and Y as 0. Such a cold start strategy does not 
utilize any information from the previous loop. We ob- 
serve that solutions from the previous inner loop are 
good initializations for next inner loop, because the 
difference in D o t in successive inner loops becomes 
smaller and smaller. So their solutions should be close 
to each other. This motivates us to use the following 
warm start strategy for the variables: 



Ai+l 
^0 



i+i 




Ko, and Y 



i+l 



(29) 



where the subscripts and superscripts are indices of the 
inner and outer loops, respectively. The subscripts 
and 00 stand for the initial and final values in an inner 
loop, respectively. 

We summarize our LADMAP with variable warm 
start (LADMAP+VWS) approach for solving TILT in 
Algorithm [T] Note that we only change the part of the 
inner loop. The rest of the algorithm is inherited from 



that in (|Zhang et al 2012b) 
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Algorithm 1 (TILT via LADMAP+VWS) 

Input: A rectangular window D g R m x n m an ; ma g ej the 
initial parameters t° of the transform, weight parameter 
X > 0, po > 1 and p ma x > 0. 

Initialize: A = D o r°, E° = 0, Y° = 0, and i = 0. 
While not converged Do 

Step 1: normalize the image and compute the Jaco- 
bian w.r.t. the transform: 

D°t< Jl+1 d t DoC \\ 



D o ■ 



Step 2: solve the linearized convex optimization: 
min ||A||„ + A||£||i, s.t. (J i + 1 ) ± (Vo T i ) = (J i+1 ) ± (A+E), 



with the initial conditions: An 



pi+1 



and Y* +1 

While ([27} or ([28]) are not satisfied Do 

5 

( I, 



i+1 
fc 



^jfe+i = vain{^ m ax,p ■ A*fc). where p is given by |23|, 
k <— fe 



+ 1. 
End While 



Step 3: compute optimal At* using Eq. | |13[ ), where 

1 — -'loo , a — -C'oo , 

Step 4: update transform: t 1+1 = r l + Z\r*, i -f— i + 1; 



End While 

Output: converged solution (A* , E* ,t*) 



4 Implementation Details 

In the previous section we introduce the basic ideas of 
LADMAP with variable warm start (LADMAP+VWS) 



algorithm for solving the inner loop of TILT ( 15 ). How- 



ever, there are still some details that need to be handled 
carefully so that the computation can be the most effi- 
cient. In this section, we first show how to compute with 
J- 1 efficiently, then we discuss how to modify the algo- 
rithm in order to accommodate some additional con- 
straints when putting TILT to real applications. 



is an ran x ran matrix, which is huge and often can- 
not be fit into the memory. Moreover, the computation 
cost will be 0((ran) 2 ), which is also unaffordable. So we 
cannot form J- 1 explicitly and then multiply it with a 
vectorized matrix. Recall that J = I — J(J T J) -1 J T , 
we may overcome this difficulty by multiplying succes- 
sively: 



J-((J T J)-^(J T v)), 



(30) 



whose computation cost is only O(pran). Note that 
(J T J) _1 is only a p x p small matrix which can be 
pre-computed and saved. So the above strategy is much 
more efficient in both memory and computation. 



4.2 Initializing Y 

In Algorithm]!] we have to multiply ( J l+1 )^ with three 
different matrices. If we initialize V I+1 in the subspace 
spanned by (J 1 * 1 )- 1 , then T fc i+1 is always in the sub- 
space spanned by (J l+1 )^ during the iteration. Then 
thanks to the idempotency of ( J 1 " 1 " 1 )- 1 , we have ( J l+1 ) ± Y k t+1 
Y k l+1 and hence A and E can be updated as 



- (J*+1)X (A <+1 + —Dot*) — , 

. - (J i + 1 ) i (4+ 1 1 + Ej+i —Dor*) — PI'Y^) . 



Then one can see that we only have to multiply ( J^ 1 ) 1 - 
with two different matrices, A\ +1 + E l k +1 -Dot 1 and 



Al+l 



■ E\ +1 ~Dot\ because ( J i + 1 )- L ( J 4*+ 1 1 + Ej+\ 



Dot 1 ) is used for updating both Y^+J and Ajj+j- This 
saves one multiplication of multiplying with (J' +1 ) ± for 
each iteration. 

To combine with the warm start technique, Y should 
be initialized as 



Y o +1 = (J l+ TY^ 



(31) 



Because the dual problem of the inner loop ( 15 ) is 



4.1 Multiplying with J 1 - 



max( J-t-Y, D o T ) s.t. ||J r - L K|| 2 < 1, ||J- L y|| 0o < A -1 , 



In Algorithm [T] J is actually an order-3 tensor ( Zhang 
et al 2012b[ ). When computing J is actually arranged 
in an ran x p matrix, where mxnis the size of Dot 
and p is the number of variables to parameterize the 
transform r. When multiplying J- 1 with a matrix M 
we actually mean to rearrange the matrix M into an 
ran vector and then multiply it with J- 1 . However, J 1 - 



where || • ||oo is the maximum absolute value in a matrix, 
we see that the optimal Y can be chosen in span(J^). 
So constraining Y in span( J 1 -) does not affect the con- 
vergence of LADMAP+VWS to the optimal solution. 
As Y^ e_span((J 4 )- L ), when J t+1 is close to J\ Y* +1 w 



Y^. So (31 1 makes good combination of warm starting 



Y and making Y £ span (J ). 



Title Suppressed Due to Excessive Length 



7 



4.3 Handling Additional Constraints on r 



As is discussed in (Zhang et al 2012b), additional con 



straints should be imposed on r so as to eliminate de- 
generate or trivial solutions, e.g., r* being 0. Typical 
constrains include that both the center and area of the 
rectangle being fixed. These additional constraints can 



be formulated as I linear constraints on At ( Zhang et al 



2012b): 



Q-At = 0, (32) 

where Q € R lxp . 

Following the same idea as that in Section 3.2 we 
aim at eliminating At with the I additional constraints. 
As At needs to satisfy both the linear constraints in Eq. 
( 32 ) and that in problem (J3j) , the overall constraints for 
At are 



Yfc = W T Y k and initialize it in span(W T W). Then M k , 
N k and Y k are computed as follows: 

M k = A k - ^Y k - W T W(A k +E k -Do T ), 
N k = A k+1 - ^Y k - W T W(A k+1 + E k -Do T ), 
Y k+l = Y k + fi k ■ W T W(A k+1 + E k+1 -Dot). 

(36) 

Again, W T W(A k+ i + E k+ i — D o r) is used to update 
both M k+ \ and Y k+ i. In this way, the iterations for the 
inner loop can be computed very efficiently. 

Now the warm start of Y is replaced by that of Y, 
which is: 



f i+1 = (W i+1 ) T W i+1 Y^. (37) 

As I <C ran, [W l+1 ) T W l+1 is actually very close to 
) . So (37) is both a good combination of warm 
starting Y and making Y e span(W T W). 



'J' 




At = 


Q. 



A + E-Dot 




(33) 

5 Warm Starting for SVD in the Inner Loop 



Similar to the deductions in Section |3.2[ we can have 
an equivalent problem: 



As shown in (24)-(25), to update A we have to compute 



the SVD of M k . Unlike other low-rank recovery prob- 



min II Al 

A,E " " 



A||S||i, s.t. WA + WE = W(D ot), (34) 2009), partial SVD cannot be used here. This is be- 



lems (Cai et al 2010 Lin et al 2009 Toh and Yun 



cause partial SVD is faster than full SVD only when the 



where 
W = 



I-J(J T J + Q T Q)- 1 J T 
-Q(J T J + Q T Q)- 1 J T 



Matrix W also enjoys a nice property similar to J 



_L. 



W r W = 1- J {J 1 J + Q T Qy 1 J T , 



(35) 



which can be utilized to reduce the computational cost. 
Moreover, \\W\\ 2 = 1. Then with some algebra L ADM AP 
applied to the above problem goes as follows: 

A k+1 = S^{M k ), 
E k+1 =T2(N k ), 

Y k+1 = V fe + Mfe • W(A k+1 + E k+1 -Dot), 
ptfc+i = min (p • (i k ,Li max ), 

where 

M k =A k - [i- k x W T Y k - W T W(A k +E k -Do T ), 
N k = A k+1 - ^W^k - W T W(A k+1 +E k -Do T ), 



and p is still computed as ( 23 ) 



Thanks to ( 35 ) , the multiplication of W W with 



a vectorized matrix can be done similarly as ( 30 1 . To 



further reduce the computational cost, we introduce 



rank of A k+ i is less than min(m, n)/5 (Linetal 2009), 
while when rectifying general textures this condition is 
often violated. So computing the full SVD of M k is very 
costly as its complexity is 0(mnmin(m,n)). Without 
exaggeration, the efficiency of computing the full SVD 
dominates the computing time of solving TILT. So we 
have to reduce the computation cost on full SVD as 
well. 

We observe that, except for the first several steps in 
the inner loop, the change in M k between two iterations 
is relatively small. So we may expect that the SVDs of 
M k and M k -\ in two successive iterations may be close 
to each other. This naturally inspires us to utilize the 
SVD of M k -i to estimate the SVD of M k . 

To do so, we first formulate the SVD problem as 
follows: 

(U*,E*,V*) = aigmm UiEiV F(U, S, V), 

(38) 

s.t. U T U = 1, S is diagonal, and V T V = I, 

where F(U, S, V) = -\\M - USV T \\ 2 F and M € R mxn 

is the matrix to be decomposed. Without loss of gener- 
ality we may assume m > n. U € ]R mXTl and V € R nxn 
are columnly orthonormal and orthogonal matrices, re- 
spectively. As we can negate the columns of U or V, 
we need not require the diagonal entries of S to be 
nonnegative. 
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5.1 Optimization with Orthogonality Constraints 

The usual method to solve a general orthogonality con- 
strained optimization problem is to search along the 
geodesic of the Stiefel manifold |^] along the direction of 
the gradient of the objective function projected onto the 
tangent plane of the manifold ( |Edelman et al 1999). 
This may require SVDs in order to generate feasible 



points on the geodesic. Fortunately, Wen and Yin (Wen 



and Yin 2012 ) recently developed a technique that does 



not rely on SVDs, making our warm start for SVD pos- 
sible. 

Denote the unknown variable as X. Suppose the 
gradient of the objective function at X is G, then the 
projection of G onto the tangent plane of the Stiefel 
manifold at X is P = GX T - XG T dEdelman et al 



1999 Wen and Yin 2012). Instead of parameterizing 



the geodesic of the Stiefel manifold along direction P 



using the exponential function, Wen and Yin (Wen and 



Yin 2012) proposed generating feasible points by the 



Cayley transform: 

X(t) = C(r)X, where C(r) = (/ + T -P) ~* (/ - T -P 

It can be verified that X(t) has the following properties: 

1. X(t) is smooth in r; 

2. (X(t)) T X(t) = 7, Vt € K, given X T X = 7; 

3. X(Q) = X; 

4 - 5^( ) = -P- 

So when r > is sufficiently small, X(t) can result in 
a smaller objective function value than X. 

X{t) could be viewed as reparameterizing the geodesic 
with r, which does not exactly equal to the geodesic 
distance. However, when r is small it is very close to 
the geodesic distance as it is the length of the two 



segments enclosing the geodesic (Wen and Yin 2012) 



When computing X(t), no SVD is required. A matrix 
inversion and some matrix multiplications are required 
instead, which is of much lower cost than SVD. How- 
ever, as both matrix multiplication/inversion and SVD 
are of the same order of complexity, we have to control 
the number of matrix multiplications and inversions, so 
that our warm start based method can be faster than 
directly computing the full SVD. 

5.2 SVD with Warm Start 



Now for our SVD problem (38), we can compute the 



with respect to (U, S, V) and search on a geodesic on 
the constraint manifold Cu x Cs x Cy in the gradient 
direction for the next best solution, where Cu is the 
Stiefel manifold of all columnly orthonormal to x n ma- 
trices, Cx: is the subspace of all n x n diagonal matrices, 
and Cy is the Stiefel manifold of all n x n orthogonal 
matrices. 

We search on the constraint manifold on the follow- 
ing curve: 



U(t) = (1+^Pu) l {l-\Pa)U, 
£{t) =E-t-P s , 

V(t) = (/+|P v )" 1 (7-|Py)V, 



(39) 



where P v = G V U T -UGjj and P v = G V V T -VG V are 
the projection of Gjj and Cy onto the tangent planes 
of Cu and Cy at U and V, respectively, and Pg = 
diag(G.s) is the projection of Gs onto Cs- The details 
of computing the gradients can be found in Appendix. 
Then we may find the optimal t* such that 



1 



t* = argmin/(t) = -\\M - U(t) ■ E(t) ■ V(ty \\ Z F . (40) 
t £ 

As fit) is a complicated function of t, the optimal t* has 
to be found by iteration. This will be costly as many 
matrix inversions and multiplication will be required. 
So we choose to approximate fit) by a quadratic func- 
tion via Taylor expansion: 



/(*)«/(0) + /'(0) 



t + lf"(o) 



(41) 



where /'(0) and f"(0) are the first and second order 
derivatives of f(t) evaluated at 0, respectively. These 
two derivatives can be computed efficiently. Details of 
the deductions can be found in Appendix. Then we can 
obtain an approximated optimal solution i* = — /'(0)//"(0) 
and approximate the SVD of M as U(i* )S(t*) V(t*) T . 

The warm start SVD method is summarized in Al- 
gorithm [2j It is called only when the difference between 
Mfc and Mk-i is smaller than a pre-defined threshold 

£ svd ■ 

Although U(t*)S(t*)V(i*) T is an approximate SVD 
of M, our final goal is to compute the singular value 
shrinkage of Mk in order to update Ak+i (see (17), 



gradient (Gy, Gz, Gy) of the objective function F(U, S, V) 

3 A Stiefel manifold is a set of columnly orthonormal ma- 
trices whose geodesic distance is induced from the Euclidian 
distance of its ambient space. 



p4| ) and ((25])), not the SVD of M k itself. We can show 
that when Mk is close enough to Mk-i, computing the 
SVD of M fe approximately still produces a highly ac- 
curate j4fc+i- The corner stone of our proof is the fol- 
lowing pscudocontraction property of the singular value 
shrinkage operator: 



||s e TO - s e (w 2 )\\% < ww, 



w 2 \\% 



l&TO-Wi]- [S s (Wa) 
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Algorithm 2 (SVD with Warm Start) 

Input: The decomposed matrices U^-i, an d Vk—i °f 

the SVD of Mk-i, and a matrix M k . 

If \\M k - M k _x\\ < e avd (Else do full svd) 

Step 0: M = M k , U(0) = U k -i, Z{0) = and 
V(0) = V k - 1 . 

Step 1: compute the projected gradients Pjj, Pg, and 
P v of F at U(Q), 17(0), and 1/(0) using |42| to respec- 
tively. 

Step 2: compute /'(0), /"(0), and t* = -/'(0)//"(0) 
using |45} to p7| . 

Step 3: compute U k = U(t*), S k - £{t*), V k = V(t*) 
using p9] ). 

Output: C/fc, E k , and Vfc as the decomposed matrices in the 
SVD of M k . 



thanks to Lemma 3.3 of (Pierra 1984) and the fact that 



S e (-) is the proximal mapping of the nuclear norm (Cai 



et al 2010). Then we have: 



\\S e (U(t*)E(t*)V(t*) T ) - S £ (M k )\\ F 

< \\U(t*)£(t*)V(t*) T - M k \\ F 

< \\U{0)S(0)V(Q) T - M k \\ F = ||M fc _i - Mfc|| F . 

Since we switch to our warm start technique for SVD 
in the inner loop only when M k is very close to M k -i 
(i.e., H-Mfe-i — M k \\ < e sv d), it is guaranteed that: 



\\S e (U(t*)E(t*)V(t*) T ) ~ S £ (M k )\\ F < e svd . 

Hence our approximate SVD for M k still results in a 
highly accurate A k+ \. 



6 Experiments 

In this section, we conduct several experiments on both 
the inner loop only and the complete TILT problem to 
evaluate the performance of our proposed LADMAP 
with warm starts method. Numerical experiments on 
the inner loop only are conducted by using synthetic 
data in order to demonstrate the effectiveness of LADMAP 
and the warm start for SVD. For the complete TILT 
problem, we conduct experiments on images with simu- 
lated and real deformations to further test the efficiency 
and robustness of LADMAP and the warm start tech- 
niques. The images we use are from a low-rank textures 
data set and a natural images data set, respectively. 
The code for the original ADM based method is pro- 



vided by the first author of ( Zhang et al 2012b ). For fair 



comparison, we set the common parameters in all com- 
pared methods the same. All the codes are in MATLAB 
and the experiments are run on a workstation with an 
Intel Xeon E5540@2.53GHz CPU and 48GB memory. 



6.1 Numerical Study on the Inner Loop Only 

In this section, we use synthetic data to compare the 
efficiency of the original ADM based algorithm, our 
LADMAP based algorithm, and LADMAP with warm 
start for computing SVD (LADMAP+SVDWS) on the 
inner loop only. As this time we only focus on the in- 
ner loop, the effectiveness of variable warm start, which 
requires outer loops, cannot be shown. We generate 
Dot and Jacobian J randomly and use them as the 
input for ADM, LADMAP, and LADMAP+SVDWS. 
For fair comparison, we tune the extra parameters £2 
and p in LADMAP and LADMAP+SVDWS and e svd 
in LADMAP+SVDWS so that the three methods stop 
with almost the same objective function values. All 
methods are initialized as A® = D o r° and E® = 0. 
Atq in the ADM method is initialized as At = 0. Un- 
der these conditions, we compare the three methods 
on their computation time, the number of iterations 
needed to converge, and the objective function value 
when the iterations stop. The comparison is done un- 
der different sizes of D o r. All the tests are repeated 
for 10 times and the average quantities are reported. 

The comparative results are shown in Table [T] We 
can observe that all the three methods arrive at roughly 
the same objective function values. However, LADMAP 
uses less than half of the number of iterations than 
ADM does and the SVD warm start only changes the 
number of iterations very slightly. Consequently, LADMAP 
is much faster than ADM, while SVDWS further speeds 
up the computation of LADMAP when the size of D o r 
is not too small (e.g., > 100). The acceleration rate also 
increases when the size of Dot grow fjf] When the size of 
Dot is very small (e.g., < 100), SVDWS does not seems 
to speed up the computation. This may due to the ultra- 
short computing time and hence other processes on the 
workstation supporting the computing environment can 
influence the total computing time. Anyway, the slow- 
down is rather insignificant. As the speed of solving 
large sized TILT is more time demanding in real appli- 
cations, adopting SVDWS is still advantageous. 



6.2 Comparisons on the Complete TILT Problem 

In this subsection, using real image data we compare 
the original ADM method, LADMAP, LADMAP with 
variable warm start (LADMAP+VWS), and LADMAP 
with both variable warm start and SVD warm start 
(LADMAP+VWS+SVDWS) on solving the whole TILT 
problem, in order to show the effectiveness of LADMAP 

4 We will see much higher acceleration rates when using 
SVDWS on real data (see Section 6.2 1. 
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Table 1 Comparison of the efficiency of algorithms on the inner loop of TILT only. The time costs, numbers of iterations, 
and optimal objective function values are averaged over 10 trials on random data. 



Size of Dot 


ADM 


LADMAP 


LADMAP+SVDWS 




time (s) 


#iter. 


obj. func. 


time (s) 


#iter. 


obj. func. 


time (s) 


#iter. 


obj. func. 


10 x 10 


1.98E-05 


49.2 


6.4836 


7.49E-06 


20.4 


6.4814 


8.76E-06 


20.9 


6.4815 


50 x 50 


0.00112 


51.5 


76.6508 


0.00054 


21.2 


76.6468 


0.00056 


22.2 


76.6585 


100 x 100 


0.00538 


51.1 


217.001 


0.00252 


22.2 


216.985 


0.00225 


22.9 


216.987 


300 x 300 


0.1859 


50 


1123.66 


0.10501 


21.9 


1123.67 


0.08792 


22.9 


1123.70 


500 x 500 


1.3802 


50 


2420.41 


0.6574 


22 


2419.63 


0.5343 


21 


2420.28 


1000 x 1000 


53.0936 


50 


6844.82 


25.8139 


23 


6844.14 


18.5953 


23 


6845.32 




Fig. 2 Range of convergence for affine transform. The x- 

axis and y-axis stand for the rotation angle 9 and the skew 
value t in an affine transform, respectively. Regions inside the 
curves are affine transforms that are successfully recovered in 
all trials. The blue dashed curve and the red solid curve are 
the boundaries of the parameters of affine transforms that 
can be successfully recovered by ADM and LADMAP, re- 
spectively. 

and the two warm start techniques. As we have pointed 
out before that the original ADM may not converge to 
the optimal solution of the inner loop, we first compare 
the convergence performance of ADM and LADMAP. 
Then we test the robustness of ADM and LADMAP 
when there are corruptions. We also present some ex- 
amples on which ADM fails but our LADMAP works 
well. Finally, we compare the computation time of var- 
ious methods on both synthetically and naturally de- 
formed images. The synthetically deformed images are 
generated by deforming the low-rank textures with pre- 
defined transforms. The naturally deformed images are 
from our images data set which contains over 100 im- 
ages downloaded from the web. 



Range of Convergence Since LADMAP for the inner 
loop is proven to converge to the optimal solution, we 
expect that it will outperform ADM in recovering the 
correct deformation when solving the whole TILT prob- 
lerrQ To show that LADMAP can recover broader range 
of deformation than ADM does, we test them with a 
standard checker-board pattern. 

5 Note that as the whole TILT problem (J^jl is not a con- 
vex program, LADMAP cannot achieve the global optimum 
either. So LADMAP can also fail to recover the correct de- 
formation. 
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Fig. 3 Visual comparison between the results by ADM 

and LADMAP. First and second columns: rectification 
results by ADM. Third and fourth columns: results by 
LADMAP. First and third columns: rectified regions of in- 
terest. Second and fourth columns: the initial transform (il- 
lustrated by red rectangles which are manually prescribed) 
and the computed transform (illustrated by green quadrilat- 
erals). 



Following the same setting in (Zhang et al 2012b), 
we deform a checker-board pattern by an affine trans- 
form: y — Ax+b, where x,y £ R 2 , and test if the two al- 
gorithms can recover the correct transform. The matrix 

cos 9 —sin 9 
sin 9 cos 9 

where 9 is the rotation angle and t is the skew value. We 
change 9 within the range [0, tt/6] with a step size 7r/60, 



A is parameterized as A(9,t) 





'it' 




1 
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and t € [0, 1] with a step size 0.05. We can observe from 
Fig. [2] that the range of convergence of our LADMAP 
completely encloses that of ADM. So the working range 
of LADMAP is larger than that of ADM. 

We further test with real images taken from natural 
scenes and manually prescribe the regions to be rec- 
tified. We have found many examples that LADMAP 
works better than ADM. However, we have not encoun- 
tered any example that ADM works better. Part of 
the examples are shown in Fig. [3j They are chosen ac- 



cording to the challenging cases listed in ( Zhang et al 



2012b) for TILT to rectify. For example, the first ex- 



ample is lacking regularity in the printed texts or the 
prescribing rectangular contains too much background; 
the second example has very large deformation; and 
the third example may has a boundary effect. On these 
examples, LADMAP works much better than ADM in 
rectifying the prescribed regions. 



Robustness under Corruption In this subsection, we com- 
pare the robustness of ADM and LADMAP when there 
is corruption in images. We test with some low-rank 
textures shown in Fig. m a) . Following the same setting 
in (Zhang et al 2012b), we introduce a small deforma- 
tion, say rotation by 10°, and examine whether the two 



methods can recover the correct transform under dif- 
ferent levels of random corruption. We randomly select 
a fraction (from 0% to 100%) of the pixels and assign 
all their RGB values as random values uniformly dis- 
tributed in (0,255). We run the two methods on such 
corrupted images and record at each level of corruption 
how many images are correctly rectified. 

The comparative results are shown inFig.gb). We 
can see that when the percentage of corruption is larger 
than 15%, our LADMAP always outperforms ADM. 
For example, when 50% of the pixels are corrupted, 
LADMAP can still obtain the correct solution on more 
than half of the test images, while ADM can only deal 
with about 30% of the images. The above comparison 
demonstrates that our LADMAP method is more ro- 
bust than the original ADM method when there is cor- 
ruption in images. 



Speed of the Algorithms In this subsection, we report 
the computational cost of four different algorithms: ADM, 
LADMAP, LADMAP+VWS, and LADMAP+VWS+ 
SVDWS for solving the whole TILT problem, in or- 
der to show the computational improvement from each 
component. 

The comparisons are done on two types of images. 
The first type of images are obtained by applying affine 
transforms to real images. The second one are images 



of natural scenes that can undergo either affine or pro- 
jective transform, in which the rectangular regions to 
be rectified are manually prescribed. 

For the first type of images, part of the comparison 
results, namely the images are rotated with 9 = 30° 
only or skewed with t = 0.2 only, are presented in 
Table [2j We tuned the four algorithms so that all of 
them produced nearly the same relative error, which 
is defined as ^TT^rp where r* is the optimal solu- 
tion produced by the corresponding algorithm and tq is 
the ground-truth deformation matrix. We can see that 
LADMAP is much faster than ADM, the variable warm 
start speeds up the convergence, and SVD warm start 
further cuts the computation cost. 

For the second type of images, comparison of the 
speed on ten of the images is shown in Table [3} We 
can see that LADMAP can be at least 20% faster than 
ADM, with the variable warm start LADMAP can be 
at least 2.5 times faster than ADM, and with the SVD 
warm start further added, LADMAP can be at least 
4.5 times faster than ADM. So the speed advantage of 
our new algorithm is apparent. 



7 Conclusions 

In this paper we propose an efficient and robust algo- 
rithm for solving the recently popular TILT problem. 
We reformulate the inner loop of TILT and apply a re- 
cently proposed LADMAP to solve the subproblem of 
the inner loop. For further speed up, we also propose 
variable warm start for initialization and introduce an 
SVD warm start technique to cut the computational 
cost of computing SVDs. Extensive experiments have 
testified to the better convergence property, higher ro- 
bustness to corruption, and faster speed of LADMAP, 
and the effectiveness of our warm start techniques. 



Acknowledgements Z. Lin is supported by the National Nat- 
ural Science Foundation of China (Grant No.s 61272341 and 
61231002). 



Appendix A: Details of Computing SVD with 
Warm Start 

In this section, for brevity we simply write U(0), I?(0), 
and V(0) as U, E, and V, respectively. 
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Fig. 4 Robustness of ADM and LADMAP under different levels of corruption, (a) Sample textures, (b) Success rates 
of ADM and LADMAP under different levels of corruption. 



7.1 Gradients 



where 



The gradients are: 
OF 
oU 

OF , rp 

G v = — = V2 2 - M T U2, 



dV 



(42) 
(43) 
(44) 



Q 771 

G s = — = 2 - U T MV. 
oh 

The projected gradients are: 

P v = G V U T - UGjj = (USV T )M T - M(USV T ) T , (45) 
P v = G V V T - VG% = (USV T ) T M - M T (USV T ), (46) 
P s = diag(I7 - U T MV). (47) 

Note that USV T is just the SVD of the matrix in 
last iteration. So we do not need to recompute their 
products. 



7.2 Derivatives of f(t) 
7.2.1 First-order Derivative 

We utilize the chain rule to calculate the derivative, let 
H = h{t) = M - U{t) ■ 2{t) ■ V(t) T 
and 1(H) = \\\Hf F , then 

no -m-« 



dt 



/ d£(H) \ T dh(t) 
\ dH J ' dt 



tr h(t) T ■ h'(t) 



h'(t) = -U'(t)2(t)V{t) T - U{t)S'(t)V(t) T 
-U(t)2(t)V'(t) T , 

and 

U'(t) = -(I + \tPu)- l \Pu{I + ItPu)- 1 ^ - \tPu)U 
-{I+ltP^lPuU, 

V'(t) T = V T \P v (I-\tP v )^ 

+V T {I+\tP v )(I-\tPv)- 1 \Pv{I-\tP v )- 1 . 

1.2.2 Second-order Derivative 

Since d(tr(Q)) = tr(dQ) and dQ T = (dQ) T , we have 



/"(*) 



where 



dt 



tr 



ti(ty -ti(t) + h(ty -h"{t) 



h"{t) = 

= -U"(t)2(t)V(t) T - U'(t)2'(t)V(t) T - U'(t)2(t)V'(t) T 
-U'(t)S'{t)V(t) T - U{t)E"(t)V(t) T - U(t)E'(t)V'(t) T 
-U'(t)S(t)V'(t) T - U{t)2'{t)V'(t) T - U(t)2(t)V"(t) T 
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Table 2 Comparison of speed on images with artificial afRne deformations. The time costs are counted in seconds. The 
relative errors are between the computed r and the ground truth. 







ADM 


LADMAP 


LADMAP+VWS 


LADMAP+VWS+SVDWS 


e = 30° 


size 


time (s) 


rel. err. 


time (s) 


rel. err. 


time (s) 


rel. err. 


time (s) 


rel. err. 


i 


50x43 


6.6318 


0.1563 


2.2941 


0.1561 


1.1857 


0.1561 


0.4034 


0.1560 


2 


31x36 


0.2503 


0.0086 


0.1834 


0.0087 


0.0913 


0.0085 


0.0063 


0.0085 


3 


38x37 


0.1594 


0.00087 


0.1057 


0.00087 


0.0459 


0.00088 


0.0338 


0.00088 


4 


28x26 


0.1091 


0.0040 


0.0634 


0.0040 


0.0408 


0.0039 


0.0353 


0.0039 


5 


38x45 


3.139 


0.4226 


1.977 


0.4219 


0.420 


0.4193 


0.199 


0.4191 


t = 0.2 


size 


time (s) 


rel. err. 


time (s) 


rel. err. 


time (s) 


rel. err. 


time (s) 


rel. err. 


1 


27x27 


0.8926 


0.0252 


0.3068 


0.0245 


0.1605 


0.0243 


0.0995 


0.0247 


2 


47x50 


3.2381 


0.2066 


2.1583 


0.2082 


0.4551 


0.1944 


0.1988 


0.1949 


3 


47x45 


0.6271 


0.0065 


0.5043 


0.0065 


0.3914 


0.0067 


0.3124 


0.0066 


4 


30x26 


2.4916 


0.1896 


1.3644 


0.1881 


0.9039 


0.1859 


0.6126 


0.1849 


5 


26x28 


0.3681 


0.0538 


0.2254 


0.0536 


0.1622 


0.0536 


0.1274 


0.0539 



Table 3 Comparison of computation speed on real images. The time costs are counted in second. The speedup rates are 
w.r.t. the baseline ADM method. 



Case 


size 


ADM 


LADMAP 


LADMAP+VWS 


LADMAP+VWS+SVDWS 






time (s) 


time (s) 


speedup 


time (s) 


speedup 


time (s) 


speedup 


1 


82x82 


6.2728 


4.8299 


1.29 


1.8119 


3.46 


1.1386 


5.50 


2 


84x78 


7.6238 


2.8338 


2.69 


1.6067 


4.75 


1.3056 


5.84 


3 


80x87 


5.4040 


2.4863 


2.17 


1.3238 


4.08 


0.9587 


5.64 


4 


68x72 


4.7575 


2.8580 


1.66 


0.9953 


4.78 


0.8237 


5.78 


5 


51x60 


3.1128 


2.7048 


1.15 


0.8821 


3.53 


0.4906 


6.35 


6 


89x90 


3.3004 


2.2573 


1.46 


0.8274 


3.99 


0.5174 


6.38 


7 


96x107 


6.7319 


2.2305 


3.02 


1.6650 


4.04 


0.8212 


8.20 


8 


65x60 


2.5367 


1.4115 


1.79 


0.5022 


5.05 


0.4172 


6.08 


9 


72x71 


2.5501 


0.9307 


2.73 


0.6234 


4.09 


0.3967 


6.43 


10 


83x90 


4.3284 


3.6923 


1.17 


1.1078 


3.91 


0.7141 


6.06 



and 

U"(t) = 

= §(/ + ^tPuy^Puii + itPu^Puii + htPu)-Hi - \tp v )u 

+ \{I+\tP u )- 1 P u {I+\tP u )-^P u U, 
Z"(t) = 0, 

v"(t) T = dv d ^ 

= lv T (I+±tP v )(I~ Up v )-*P v {I- UPv)- l Pv{I- \ 



However, /(0) need not be computed as we are only 
interested in /'(0) and /"(0) for obtaining t* . 

we have 



tP v 



+ lV T P v (I-±tP v )- 1 P v (I- \tP v 



7.3 Taylor Expansion 

We want to do the second-order Taylor expansion of 
f(t) at t = 0: 

f(t)*f(0) + f'(0)-t+±f"(0)-t 2 . 
First, the function value at is: 



f(0) = -\\M-UEV 1 \\ 2 F . 



Second, since f'(t) = tr h(t) T ■ h'(t) 
/'(0)=tr[M0) T ■ft'(O) 

where 

h(0) = M - U{0)S{0)V(0) T , 
V(0) = -U'(0)Z(Q)V(0f - U(0)S'(0)V(0) T 
-U(0)S(0)V'(0) T . 

Given U(Q) = U, S(0) = S, V(0) = V, U'(0) = -P n U, 
Z"(0) = -P s ,V'(0) = -P V V, we have 

h'(0) = Pu(USV T ) + UP S V T - {USV T )P V . 

Finally, we come to the second-order derivative: 

/"(0) = trk'(0) T • fc'(0) + h(0f ■ h"(0)] , 

where 
h"(0) 

= -C/"(0)i7(0)t/(0) T - J7'(0)I7'(0)F(0) T - t/'(0)i7(0)V'(0) T 
-U'(0)S'(0)V(0) T - U(0)Z"(0)V(0) T - U(0)S'(0)V'(0) T 
-U'(0)S(0)V'(0) T - U(0)Z'(0)V'{0) T - U(0)S(0)V"{0) T , 
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with 17" (0) = P$U, E"(0) = and V"(0) = P^V. 
Thus we have 

h"(0) = -P$U£V T - 2 • PuUP s V T + 2 • P V UJ:V T P^ 
+2 • UP S V T P V - UZV T Pfr 
= -PuZ + ZPy, 

where Z = h'(0) + UP S V T . 
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